Assessing the Quality and Cleaning of a Software Project Dataset: An Experience Report
Authors
Abstract
OBJECTIVE The aim is to report an assessment of the impact that noise has on predictive accuracy by comparing noise handling techniques. METHOD We describe the process of cleaning a large software management dataset that initially comprised more than 10,000 projects. Data quality is assessed mainly through feedback from the data provider and manual inspection of the data. Three methods of noise correction (polishing, noise elimination and robust algorithms) are compared with each other in terms of accuracy. Noise detection was undertaken using a regression tree model. RESULTS The three noise correction methods were compared, and differences in their accuracy were noted. CONCLUSIONS The results demonstrated that polishing improves classification accuracy compared with the noise elimination and robust algorithm approaches.
Similar resources
A Model-Driven Decision Support System for Software Cost Estimation (Case Study: Projects in NASA60 Dataset)
Estimating the costs of software development is one of the most important activities in software project management. Inaccuracies in such estimates may cause irreparable loss. A low estimate of the cost of a project will result in failure to deliver on time and indicates the inefficiency of the software development team. On the other hand, high estimates of resources and costs for a project wil...
A measurement based software quality framework
In this report we propose a solution to the problem of dependency on the experience of software project quality assurance personnel by providing a transparent, objective and measurement based quality framework. The framework helps quality assurance experts make objective and comparable decisions in software projects by defining and assessing measurable quality goals and thresholds, di...
Assessment of the effects of Megaprojects Implementation on the Quality of the Surrounding Environment (case study: Velayat Park in Tehran)
The development of science and technology and the growing trend of globalization have pushed urban communities toward competition in attracting foreign capital and constructing macro-scale projects. This competition has affected not only developed countries but also developing countries such as Iran, since it is expected that each urban development will finally result in an increase in the qu...
Developing a Risk Management Model for Banking Software Development Projects Based on Fuzzy Inference System
Risk management is one of the most influential parts of project management and has a major impact on the success or failure of projects. Due to the increasing use of information technology (IT) systems in all fields and the high failure rate of IT projects in software development and production, it is essential to manage these projects effectively. Therefore, this study is aimed t...
The effect of data cleaning on record linkage quality
BACKGROUND Within the field of record linkage, numerous data cleaning and standardisation techniques are employed to ensure the highest quality of links. While these facilities are common in record linkage software packages and are regularly deployed across record linkage units, little work has been published demonstrating the impact of data cleaning on linkage quality. METHODS A range of cle...